Adaptive Semiparametric Language Models

Authors

Abstract

We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component in an integrated architecture. Our model uses extended short-term context by caching local hidden states (similar to Transformer-XL) and global long-term memory by retrieving a set of nearest neighbor tokens at each timestep. We design a gating function to adaptively combine multiple information sources to make a prediction. This mechanism allows the model to use either local context, short-term memory, or long-term memory (or any combination of them) on an ad hoc basis depending on the context. Experiments on word-based and character-based language modeling datasets demonstrate the efficacy of our proposed method compared to strong baselines.
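The gating idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`retrieve_neighbors`, `gated_combine`), the squared Euclidean distance, the dot-product attention over retrieved items, and the single scalar sigmoid gate are all assumptions chosen for illustration. The sketch retrieves the k nearest memory entries for a hidden state, summarizes them, and mixes that summary with the hidden state.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def retrieve_neighbors(query, memory_keys, k):
    """Indices of the k keys closest to query (squared Euclidean distance)."""
    dists = [sum((q - x) ** 2 for q, x in zip(query, key)) for key in memory_keys]
    return sorted(range(len(memory_keys)), key=lambda i: dists[i])[:k]


def gated_combine(hidden, memory_keys, memory_values, gate_weights, k=2):
    """Mix a hidden state with a summary of retrieved long-term memory.

    Assumption: a scalar gate g = sigmoid(gate_weights . hidden) decides how
    much weight the hidden state receives versus the memory summary.
    """
    idx = retrieve_neighbors(hidden, memory_keys, k)
    # Attention weights over the retrieved entries (hypothetical choice).
    attn = softmax([dot(hidden, memory_keys[i]) for i in idx])
    summary = [
        sum(a * memory_values[i][d] for a, i in zip(attn, idx))
        for d in range(len(hidden))
    ]
    gate = 1.0 / (1.0 + math.exp(-dot(gate_weights, hidden)))
    mixed = [gate * h + (1.0 - gate) * m for h, m in zip(hidden, summary)]
    return mixed, gate


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    values = [[0.0, 2.0], [2.0, 0.0], [9.0, 9.0]]
    mixed, gate = gated_combine([1.0, 0.0], keys, values, [0.0, 0.0], k=1)
    print(mixed, gate)
```

With zero gate weights the gate sits at 0.5, so the output is an even blend of the hidden state and the nearest memory value. In a trained model the gate weights would be learned, letting the model lean on memory for some tokens and on local context for others.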

Similar resources

Adaptive Methods for Spatial Scan Analysis via Semiparametric Mixture Models

Spatial scan density (SSD) estimation via mixture models is an important problem in the field of spatial statistical analysis and has wide applications in image analysis. The “borrowed strength” density estimation (BSDE) method via mixture models enables one to estimate the local probability density function in a random field wherein potential similarities between the density functions for the s...

Adaptive Bayesian Regression Splines in Semiparametric Generalized Linear Models

This paper presents a fully Bayesian approach to regression splines with automatic knot selection in generalized semiparametric models for fundamentally non-Gaussian responses. In a basis function representation of the regression spline, we use a B-spline basis. The reversible jump Markov chain Monte Carlo method allows for simultaneous estimation both of the number of knots and the knot placement...

An Adaptive Estimation Method for Semiparametric Models and Dimension Reduction

Xia, Tong, Li and Zhu (2002) proposed a general estimation method termed minimum average variance estimation (MAVE) for semiparametric models. The method has been found very useful in estimating complicated semiparametric models (Xia, Zhang and Tong, 2004; Xia and Härdle, 2006) and general dimension reduction (Xia, 2008; Wang and Xia, 2008). The method is also convenient to combine with other m...

Ridge Stochastic Restricted Estimators in Semiparametric Linear Measurement Error Models

In this article we consider the stochastic restricted ridge estimation in semiparametric linear models when the covariates are measured with additive errors. The development of a penalized corrected likelihood method in such a model is the basis for derivation of ridge estimates. The asymptotic normality of the resulting estimates is established. Also, necessary and sufficient condition...

Generalized Ridge Regression Estimator in Semiparametric Regression Models

In the context of ridge regression, the estimation of the ridge (shrinkage) parameter plays an important role in analyzing data. Much effort has gone into developing methods of computing shrinkage estimators for different fully parametric ridge regression approaches, using eigenvalues. However, the estimation of the shrinkage parameter is neglected for semiparametric regression models. The m...

Journal

Journal title: Transactions of the Association for Computational Linguistics

Year: 2021

ISSN: 2307-387X

DOI: https://doi.org/10.1162/tacl_a_00371